PATENT APPLICATION 

What is claimed is: 



CLAIMS 



SOO-083 



1. A program storage device accessible by a computer, tangibly embodying a program of
instructions executable by said computer to perform method steps for protein structure
alignment, said method steps comprising:

(a) receiving a first protein with N₁ atoms;

(b) receiving a second protein with N₂ atoms;

(c) making an initial alignment of said atoms of said first protein to said atoms of said second
protein;

(d) calculating all atomic distances between said coordinates of said atoms of said first protein
and said atomic coordinates of said atoms of said second protein;

(e) defining a matrix with a plurality of binary assignment variables wherein each binary
assignment variable corresponds to either a match or a gap;

(f) defining one or more mean field equations wherein said plurality of binary assignment
variables are replaced by a plurality of continuous mean field variables, whereby each said
mean field variable has a value between 0 and 1, and a plurality of forces that are
proportional to said atomic distances squared;

(g) formulating an energy function, wherein:
said energy function includes a first cost for each said atomic distance wherein said
distance is a weighted body transformation using said continuous mean field variables
of said first protein while keeping said second protein fixed;

said energy function includes a second cost λ for each said gap by either said first
protein or said second protein;
said energy function includes a third cost δ for a position-independent consecutive said
gap;
said energy function includes a fourth cost for enforcing a constraint to satisfy that each
said atom of said first protein either aligns with said atom of said second protein or with
said gap;

(h) minimizing said energy function by an iterative process and updating said continuous
mean field variables in said mean field equations for a decreasing set of temperatures T until
convergence to a predefined convergence value is reached; and

(i) after convergence, rounding off said continuous mean field variables to either 0 or 1.
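
Steps (d) through (i) can be illustrated with a deliberately simplified sketch. It is not the claimed implementation: it assumes the two coordinate sets are already superposed, uses only the squared-distance cost and a single gap cost λ, omits the third, fourth, and crossed-match costs, and assumes a softmax form for the mean field update, geometric cooling by a factor `eps`, and a saturation threshold as the convergence value. The function name `mean_field_align` and all parameter names are hypothetical.

```python
import numpy as np

def mean_field_align(x1, x2, gap_cost=0.1, t0=10.0, eps=0.8,
                     sat_target=0.99, max_iter=200):
    """Toy mean-field annealing over match/gap assignments (illustrative only).

    x1: (N1, 3) coordinates, x2: (N2, 3) coordinates, assumed pre-superposed.
    Returns an (N1, N2 + 1) matrix of 0/1 values; the last column marks a gap.
    """
    x1 = np.asarray(x1, dtype=float)
    x2 = np.asarray(x2, dtype=float)
    n1 = x1.shape[0]
    # Step (d): all pairwise squared distances, shape (N1, N2).
    d2 = ((x1[:, None, :] - x2[None, :, :]) ** 2).sum(axis=-1)
    # Cost of each state for atom i: squared distance for a match,
    # gap_cost (lambda) for the gap state in the extra last column.
    cost = np.hstack([d2, np.full((n1, 1), gap_cost)])
    # Step (f): continuous variables in [0, 1], started from a uniform guess.
    v = np.full(cost.shape, 1.0 / cost.shape[1])
    t = t0
    for _ in range(max_iter):
        # Assumed softmax update; subtracting the row minimum avoids overflow.
        v = np.exp(-(cost - cost.min(axis=1, keepdims=True)) / t)
        v /= v.sum(axis=1, keepdims=True)
        # Assumed convergence measure: mean row saturation approaches 1
        # as the variables become nearly binary.
        if (v ** 2).sum() / n1 > sat_target:
            break
        t *= eps  # step (h): decreasing set of temperatures
    # Step (i): round each row to a single 1 at its largest variable.
    s = np.zeros_like(v)
    s[np.arange(n1), v.argmax(axis=1)] = 1.0
    return s
```

Because the coupling terms are left out, the update in this sketch depends only on the temperature; in a fuller treatment the costs would themselves depend on the current mean field variables, so each iteration would feed into the next.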

2. The method as set forth in claim 1, wherein said step of formulating an energy function 
further comprises a fifth cost for discouraging crossed matches. 

3. The method as set forth in claim 1, wherein said second cost λ is a value between 0.01
and 0.5.

4. The method as set forth in claim 3, wherein said second cost λ for an α-site in an α-helix
has a larger said second cost λ by a factor between 0.01 and 0.5 of said second cost λ.

5. The method as set forth in claim 3, wherein said second cost λ for a β-sheet has a
larger said second cost λ by a factor between 0.01 and 0.5 of said second cost λ.

6. The method as set forth in claim 1, wherein said third cost δ is a function of said
second cost λ divided by a value between 1 and 20.

7. The method as set forth in claim 1, wherein said fourth cost includes a parameter γ with a
value between 0 and 0.2.

8. The method as set forth in claim 1, wherein said step of minimizing by an iterative
process includes an iteration parameter ε with a value between 0.5 and 0.95.



9. The method as set forth in claim 1, wherein said step of minimizing by an iterative
process further comprises the step of initializing said temperature to a value between 1
and 100.
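
The numeric ranges recited in claims 3 and 6 through 9 can be collected in one place, as in the hedged sketch below. Several readings here are assumptions rather than statements from the claims: that the ranges are inclusive, that the third cost is δ = λ / k for a divisor k between 1 and 20, and that the iteration parameter ε acts as a multiplicative cooling factor on the temperature. The names `AnnealingParams` and `temperatures` are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class AnnealingParams:
    gap_cost: float = 0.1     # lambda, claim 3: between 0.01 and 0.5
    gap_divisor: float = 2.0  # claim 6: divisor between 1 and 20
    gamma: float = 0.1        # claim 7: between 0 and 0.2
    epsilon: float = 0.8      # claim 8: between 0.5 and 0.95
    t_init: float = 10.0      # claim 9: between 1 and 100

    def __post_init__(self):
        # Ranges as recited in the claims; inclusive bounds are an assumption.
        ranges = {
            "gap_cost": (0.01, 0.5),
            "gap_divisor": (1.0, 20.0),
            "gamma": (0.0, 0.2),
            "epsilon": (0.5, 0.95),
            "t_init": (1.0, 100.0),
        }
        for name, (lo, hi) in ranges.items():
            value = getattr(self, name)
            if not lo <= value <= hi:
                raise ValueError(f"{name}={value} is outside [{lo}, {hi}]")

    @property
    def delta(self) -> float:
        # Assumed reading of claim 6: delta = lambda / divisor.
        return self.gap_cost / self.gap_divisor

def temperatures(params: AnnealingParams, steps: int) -> list:
    # Assumed geometric schedule: T_k = T_init * epsilon ** k.
    return [params.t_init * params.epsilon ** k for k in range(steps)]
```

With the defaults above, `temperatures(AnnealingParams(), 3)` yields a schedule starting at 10 and shrinking by a factor of 0.8 per step, and constructing `AnnealingParams(epsilon=0.99)` raises a `ValueError`.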

10. A method of using a mean field approach for protein structure alignment, comprising the 
steps of: 

(a) providing a first protein with N₁ atoms;

(b) providing a second protein with N₂ atoms;

(c) making an initial alignment of said atoms of said first protein to said atoms of said second 
protein; 

(d) calculating all atomic distances between said coordinates of said atoms of said first protein 
and said atomic coordinates of said atoms of said second protein; 

(e) defining a matrix with a plurality of binary assignment variables wherein each binary
assignment variable corresponds to either a match or a gap;

(f) defining one or more mean field equations wherein said plurality of binary assignment
variables are replaced by a plurality of continuous mean field variables, whereby each said
mean field variable has a value between 0 and 1, and a plurality of forces that are
proportional to said atomic distances squared;

(g) formulating an energy function, wherein:

said energy function includes a first cost for each said atomic distance wherein said distance
is a weighted body transformation using said continuous mean field variables of said first
protein while keeping said second protein fixed;

said energy function includes a second cost λ for each said gap by either said first protein or
said second protein;

said energy function includes a third cost δ for a position-independent consecutive said gap;

said energy function includes a fourth cost for enforcing a constraint to satisfy that each said
atom of said first protein either aligns with said atom of said second protein or with said gap;

(h) minimizing said energy function by an iterative process and updating said continuous
mean field variables in said mean field equations for a decreasing set of temperatures T until
convergence to a predefined convergence value is reached; and

(i) after convergence, rounding off said continuous mean field variables to either 0 or 1.
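
The first cost in step (g) moves the first protein by a weighted body transformation while the second protein stays fixed. One way to sketch such a fit, assuming a rigid rotation plus translation and using a weighted form of the Kabsch superposition (a technique swapped in here for illustration, not one named in the claims), is shown below. The weight matrix `w` stands in for the match part of the continuous mean field variables, and the function name `weighted_superpose` is hypothetical.

```python
import numpy as np

def weighted_superpose(x1, x2, w):
    """Fit x1 onto fixed x2, minimizing sum_ij w[i, j] * |R x1[i] + a - x2[j]|^2.

    x1: (N1, 3), x2: (N2, 3), w: (N1, N2) non-negative weights.
    Returns the moved copy of x1, the rotation R, and the translation a.
    """
    x1 = np.asarray(x1, dtype=float)
    x2 = np.asarray(x2, dtype=float)
    w = np.asarray(w, dtype=float)
    total = w.sum()
    # Weighted centroids: each atom counts by its total match weight.
    c1 = w.sum(axis=1) @ x1 / total
    c2 = w.sum(axis=0) @ x2 / total
    # Weighted 3x3 cross-covariance of the centered coordinate sets.
    h = (x1 - c1).T @ w @ (x2 - c2)
    u, _, vt = np.linalg.svd(h)
    # Flip the last axis if needed so the result is a proper rotation.
    d = np.sign(np.linalg.det(vt.T @ u.T))
    rot = vt.T @ np.diag([1.0, 1.0, d]) @ u.T
    shift = c2 - rot @ c1
    return x1 @ rot.T + shift, rot, shift
```

In an annealing loop of the kind recited in step (h), a fit like this would plausibly be recomputed as the mean field variables change, so that the squared distances entering the first cost always reflect the current soft assignment.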

11. The method as set forth in claim 10, wherein said step of formulating an energy 
function further comprises a fifth cost for discouraging crossed matches. 

12. The method as set forth in claim 10, wherein said second cost λ is a value between
0.01 and 0.5.

13. The method as set forth in claim 12, wherein said second cost λ for an α-site in an
α-helix has a larger said second cost λ by a factor between 0.01 and 0.5 of said second
cost λ.

14. The method as set forth in claim 12, wherein said second cost λ for a β-sheet has a
larger said second cost λ by a factor between 0.01 and 0.5 of said second cost λ.

15. The method as set forth in claim 10, wherein said third cost δ is a function of said
second cost λ divided by a value between 1 and 20.

16. The method as set forth in claim 10, wherein said fourth cost includes a parameter γ with a
value between 0 and 0.2.



17. The method as set forth in claim 10, wherein said step of minimizing by an iterative
process includes an iteration parameter ε with a value between 0.5 and 0.95.

18. The method as set forth in claim 10, wherein said step of minimizing by an iterative
process further comprises the step of initializing said temperature to a value between 1
and 100.